One-Annotated Constrained Sequence Alignment
Authors
Abstract
The constrained multiple sequence alignment (CMSA) problem is to align a set of strings such that the given patterns (the constraint) appear in the same positions, in a specified order, in each of the strings in the resulting alignment. The best previous result for the pair-wise version takes O(mn) time and space [2, 10], where m is the number of patterns (defined later) and n is the maximum string length. In this paper, we deal with the pair-wise case when the positions of the occurrences of the patterns in one of the strings are given. This version arises naturally in applications but has not been discussed previously [8, 2, 10]. We present an algorithm taking O(n) time and O(n + r) space for this version, where r is the number of occurrences of all the patterns. This result in turn improves the 2-approximation algorithm proposed in [2] for CMSA from O(Ckmn) time and O(kmn) space to O(Ckn) time and O(kn) space for the original problem, where k is the number of sequences and C is the maximum number of valid "constrained lists" (defined later).
Key-Words: biological sequence comparison, constrained sequence alignment, computational biology
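To make the constraint in the abstract concrete, the following minimal sketch checks whether a given pair-wise alignment satisfies a list of patterns. It is an illustration only, not the algorithm of the paper: the function name `satisfies_constraint`, the use of `-` as the gap symbol, and the simplifying assumption that each pattern must appear exactly and without gaps in the same columns of both rows are all assumptions made here.

```python
def satisfies_constraint(row_a, row_b, patterns):
    """Check a pair-wise alignment against an ordered list of patterns.

    row_a and row_b are the two rows of an alignment (equal length,
    '-' marks a gap). Assumption for this sketch: a pattern counts as
    aligned only if it occurs exactly, with no gaps, in the same
    columns of both rows, and the patterns occur left to right in the
    given order without overlapping.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")

    col = 0  # first column still available for the next pattern
    for pattern in patterns:
        width = len(pattern)
        # Greedily take the earliest column where both rows spell the pattern;
        # taking the earliest match leaves the most room for later patterns.
        while col + width <= len(row_a):
            if row_a[col:col + width] == pattern and row_b[col:col + width] == pattern:
                break
            col += 1
        else:
            return False  # ran out of columns without finding the pattern
        col += width  # later patterns must start after this occurrence
    return True


if __name__ == "__main__":
    a = "AC-GTTA"
    b = "ACCGT-A"
    print(satisfies_constraint(a, b, ["AC", "GT"]))
    print(satisfies_constraint(a, b, ["GT", "AC"]))
```

Under these assumptions, the same two rows can satisfy one ordering of the patterns and fail another, which is why the constraint is stated as an ordered sequence rather than a set.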
Similar resources
Constrained Sequence Alignment: A Dedicated Version and Its Applications
In this paper, we study a problem that arises naturally in biological applications. Given two sequences, along with a sequence of patterns, we want to align the two sequences such that the specified patterns are aligned together. This is the constrained sequence alignment problem and is defined in [14]. The multiple sequence version is called CMSA. In this paper, we focus on the pairwise versio...
Identifying a High Fraction of the Human Genome to be under Selective Constraint Using GERP++
Computational efforts to identify functional elements within genomes leverage comparative sequence information by looking for regions that exhibit evidence of selective constraint. One way of detecting constrained elements is to follow a bottom-up approach by computing constraint scores for individual positions of a multiple alignment and then defining constrained elements as segments of contig...
An Algorithm and Applications to Sequence Alignment with Weighted Constraints
Given two sequences S1, S2, and a constrained sequence C, a longest common subsequence of S1, S2 with restriction to C is called a constrained longest common subsequence of S1 and S2 with C. At the same time, an optimal alignment of S1, S2 with restriction to C is called a constrained pairwise sequence alignment of S1 and S2 with C. Previous algorithms have shown that the constrained longest co...
A Parallel GPU-Designed Algorithm for the Constrained Multiple Sequence Alignment Problem
Modern graphical processing units (GPUs) offer much more computational power than modern CPUs, so it is natural that GPUs are often used for solving many computationally-intensive problems. One of the tasks of huge importance in bioinformatics is sequence alignment. We investigate its variant introduced a few years ago in which some additional requirement on the alignment is given. As a result ...
Ore extraction and blending optimization model in poly-metallic open pit mines by chance constrained one-sided goal programming
Determining the sequence of ore extraction is one of the most important problems in annual mine production scheduling. Production scheduling affects mining performance, especially in a poly-metallic open pit mine, when considering the operational and physical constraints imposed by the high levels of reliability required of the actual results. One of the important operational con...